Method for rapid detection and identification of bioagents

ABSTRACT

Method for detecting and identifying unknown bioagents, including bacteria, viruses and the like, by a combination of nucleic acid amplification and molecular weight determination using primers which hybridize to conserved sequence regions of nucleic acids derived from a bioagent and which bracket variable sequence regions that uniquely identify the bioagent. The result is a “base composition signature” (BCS) which is then matched against a database of base composition signatures, by which the bioagent is identified.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a divisional of U.S. Ser. No. 09/798,007, filed Mar. 2, 2001, now abandoned, which is incorporated herein by reference in its entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with United States Government support under DARPA/SPO contract BAA00-09. The United States Government may have certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates to methods for rapid detection and identification of bioagents from environmental, clinical or other samples. The methods provide for detection and characterization of a unique base composition signature (BCS) from any bioagent, including bacteria and viruses. The unique BCS is used to rapidly identify the bioagent.

BACKGROUND OF THE INVENTION

Rapid and definitive microbial identification is desirable for a variety of industrial, medical, environmental, quality, and research reasons. Traditionally, the microbiology laboratory has functioned to identify the etiologic agents of infectious diseases through direct examination and culture of specimens. Since the mid-1980s, researchers have repeatedly demonstrated the practical utility of molecular biology techniques, many of which form the basis of clinical diagnostic assays. Some of these techniques include nucleic acid hybridization analysis, restriction enzyme analysis, genetic sequence analysis, and separation and purification of nucleic acids (See, e.g., J. Sambrook, E. F. Fritsch, and T. Maniatis, Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989). These procedures, in general, are time-consuming and tedious. Another option is the polymerase chain reaction (PCR) or other amplification procedure which amplifies a specific target DNA sequence based on the flanking primers used. Finally, detection and data analysis convert the hybridization event into an analytical result.

Other techniques for detection of bioagents include high-resolution mass spectrometry (MS), low-resolution MS, fluorescence, radioiodination, DNA chips and antibody techniques. None of these techniques is entirely satisfactory.

Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. However, high-resolution MS alone fails to perform against unknown or bioengineered agents, or in environments where there is a high background level of bioagents (“cluttered” background). Low-resolution MS can fail to detect some known agents, if their spectral lines are sufficiently weak or sufficiently close to those from other living organisms in the sample. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to detect a particular organism.

Antibodies face more severe diversity limitations than arrays. If antibodies are designed against highly conserved targets to increase diversity, the false alarm problem will dominate, again because threat organisms are very similar to benign ones. Antibodies are only capable of detecting known agents in relatively uncluttered environments.

Several groups have described detection of PCR products using high resolution electrospray ionization-Fourier transform-ion cyclotron resonance mass spectrometry (ESI-FT-ICR MS). Accurate measurement of exact mass combined with knowledge of the number of at least one nucleotide allowed calculation of the total base composition for PCR duplex products of approximately 100 base pairs (Aaserud et al., J. Am. Soc. Mass Spec. 7:1266–1269, 1996; Muddiman et al., Anal. Chem. 69:1543–1549, 1997; Wunschel et al., Anal. Chem. 70:1203–1207, 1998; Muddiman et al., Rev. Anal. Chem. 17:1–68, 1998). Electrospray ionization-Fourier transform-ion cyclotron resonance (ESI-FT-ICR) MS may be used to determine the mass of double-stranded, 500 base-pair PCR products via the average molecular mass (Hurst et al., Rapid Commun. Mass Spec. 10:377–382, 1996). The use of matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry for characterization of PCR products has been described (Muddiman et al., Rapid Commun. Mass Spec. 13:1201–1204, 1999). However, the degradation of DNAs over about 75 nucleotides observed with MALDI limited the utility of this method.

U.S. Pat. No. 5,849,492 describes a method for retrieval of phylogenetically informative DNA sequences which comprises searching for a highly divergent segment of genomic DNA surrounded by two highly conserved segments, designing universal primers for PCR amplification of the highly divergent region, amplifying the genomic DNA by PCR using the universal primers, and then sequencing the gene to determine the identity of the organism.

U.S. Pat. No. 5,965,363 discloses methods for screening nucleic acids for polymorphisms by analyzing amplified target nucleic acids using mass spectrometric techniques, and procedures for improving the mass resolution and mass accuracy of these methods.

WO 99/14375 describes methods, PCR primers and kits for use in analyzing preselected DNA tandem nucleotide repeat alleles by mass spectrometry.

WO 98/12355 discloses methods of determining the mass of a target nucleic acid by mass spectrometric analysis, by cleaving the target nucleic acid to reduce its length, making the target single-stranded and using MS to determine the mass of the single-stranded shortened target. Also disclosed are methods of preparing a double-stranded target nucleic acid for MS analysis comprising amplification of the target nucleic acid, binding one of the strands to a solid support, releasing the second strand and then releasing the first strand which is then analyzed by MS. Kits for target nucleic acid preparation are also provided.

PCT WO97/33000 discloses methods for detecting mutations in a target nucleic acid by nonrandomly fragmenting the target into a set of single-stranded nonrandom length fragments and determining their masses by MS.

U.S. Pat. No. 5,605,798 describes a fast and highly accurate mass spectrometer-based process for detecting the presence of a particular nucleic acid in a biological sample for diagnostic purposes.

WO 98/21066 describes processes for determining the sequence of a particular target nucleic acid by mass spectrometry. Processes for detecting a target nucleic acid present in a biological sample by PCR amplification and mass spectrometry detection are disclosed, as are methods for detecting a target nucleic acid in a sample by amplifying the target with primers that contain restriction sites and tags, extending and cleaving the amplified nucleic acid, and detecting the presence of extended product, wherein the presence of a DNA fragment of a mass different from wild-type is indicative of a mutation. Methods of sequencing a nucleic acid via mass spectrometry methods are also described.

WO 97/37041, WO 99/31278 and U.S. Pat. No. 5,547,835 describe methods of sequencing nucleic acids using mass spectrometry. U.S. Pat. Nos. 5,622,824, 5,872,003 and 5,691,141 describe methods, systems and kits for exonuclease-mediated mass spectrometric sequencing.

Thus, there is a need for a method for bioagent detection and identification which is both specific and rapid, and in which no nucleic acid sequencing is required. The present invention addresses this need.

SUMMARY OF THE INVENTION

One embodiment of the present invention is a method of identifying an unknown bioagent comprising (a) contacting nucleic acid from the bioagent with at least one pair of oligonucleotide primers which hybridize to sequences of the nucleic acid and flank a variable nucleic acid sequence; (b) amplifying the variable nucleic acid sequence to produce an amplification product; (c) determining the molecular mass of the amplification product; and (d) comparing the molecular mass to one or more molecular masses of amplification products obtained by performing steps (a)–(c) on a plurality of known organisms, wherein a match identifies the unknown bioagent. In one aspect of this preferred embodiment, the sequences to which the at least one pair of oligonucleotide primers hybridize are highly conserved. Preferably, the amplifying step comprises polymerase chain reaction. Alternatively, the amplifying step comprises ligase chain reaction or strand displacement amplification. In one aspect of this preferred embodiment, the bioagent is a bacterium, virus, cell or spore. Advantageously, the nucleic acid is ribosomal RNA. In another aspect, the nucleic acid encodes RNase P or an RNA-dependent RNA polymerase. Preferably, the amplification product is ionized prior to molecular mass determination. The method may further comprise the step of isolating nucleic acid from the bioagent prior to contacting the nucleic acid with the at least one pair of oligonucleotide primers. The method may further comprise the step of performing steps (a)–(d) using a different oligonucleotide primer pair and comparing the results to one or more molecular masses of amplification products obtained by performing steps (a)–(c) on a different plurality of known organisms from those in step (d). Preferably, the one or more molecular masses are contained in a database of molecular masses. In another aspect of this preferred embodiment, the amplification product is ionized by electrospray ionization, matrix-assisted laser desorption or fast atom bombardment. Advantageously, the molecular mass is determined by mass spectrometry. Preferably, the mass spectrometry is Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF or triple quadrupole. The method may further comprise performing step (b) in the presence of an analog of adenine, thymidine, guanosine or cytidine having a different molecular weight than adenosine, thymidine, guanosine or cytidine. In one aspect, the oligonucleotide primer comprises a base analog or substitute base at positions 1 and 2 of each triplet within the primer, wherein the base analog or substitute base binds with increased affinity to its complement compared to the native base. Preferably, the primer comprises a universal base at position 3 of each triplet within the primer. The base analog or substitute base may be 2,6-diaminopurine, propyne T, propyne G, phenoxazines or G-clamp. Preferably, the universal base is inosine, guanidine, uridine, 5-nitroindole, 3-nitropyrrole, dP or dK, or 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide.

Another embodiment of the present invention is a method of identifying an unknown bioagent comprising (a) contacting nucleic acid from the bioagent with at least one pair of oligonucleotide primers which hybridize to sequences of the nucleic acid and flank a variable nucleic acid sequence; (b) amplifying the variable nucleic acid sequence to produce an amplification product; (c) determining the base composition of the amplification product; and (d) comparing the base composition to one or more base compositions of amplification products obtained by performing steps (a)–(c) on a plurality of known organisms, wherein a match identifies the unknown bioagent. In one aspect of this preferred embodiment, the sequences to which the at least one pair of oligonucleotide primers hybridize are highly conserved. Preferably, the amplifying step comprises polymerase chain reaction. Alternatively, the amplifying step comprises ligase chain reaction or strand displacement amplification. In one aspect of this preferred embodiment, the bioagent is a bacterium, virus, cell or spore. Advantageously, the nucleic acid is ribosomal RNA. In another aspect, the nucleic acid encodes RNase P or an RNA-dependent RNA polymerase. Preferably, the amplification product is ionized prior to molecular mass determination. The method may further comprise the step of isolating nucleic acid from the bioagent prior to contacting the nucleic acid with the at least one pair of oligonucleotide primers. The method may further comprise the step of performing steps (a)–(d) using a different oligonucleotide primer pair and comparing the results to one or more base composition signatures of amplification products obtained by performing steps (a)–(c) on a different plurality of known organisms from those in step (d). Preferably, the one or more base compositions are contained in a database of base compositions. In another aspect of this preferred embodiment, the amplification product is ionized by electrospray ionization, matrix-assisted laser desorption or fast atom bombardment. Advantageously, the molecular mass is determined by mass spectrometry. Preferably, the mass spectrometry is Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF or triple quadrupole. The method may further comprise performing step (b) in the presence of an analog of adenine, thymidine, guanosine or cytidine having a different molecular weight than adenosine, thymidine, guanosine or cytidine. In one aspect, the oligonucleotide primer comprises a base analog or substitute base at positions 1 and 2 of each triplet within the primer, wherein the base analog or substitute base binds with increased affinity to its complement compared to the native base. Preferably, the primer comprises a universal base at position 3 of each triplet within the primer. The base analog or substitute base may be 2,6-diaminopurine, propyne T, propyne G, phenoxazines or G-clamp. Preferably, the universal base is inosine, guanidine, uridine, 5-nitroindole, 3-nitropyrrole, dP or dK, or 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide.

The present invention also provides a method for detecting a single nucleotide polymorphism in an individual, comprising the steps of (a) isolating nucleic acid from the individual; (b) contacting the nucleic acid with oligonucleotide primers which hybridize to regions of the nucleic acid which flank a region comprising the potential polymorphism; (c) amplifying the region to produce an amplification product; (d) determining the molecular mass of the amplification product; and (e) comparing the molecular mass to the molecular mass of the region in an individual known to have the polymorphism, wherein if the molecular masses are the same then the individual has the polymorphism.

In one aspect of this preferred embodiment, the primers hybridize to highly conserved sequences. Preferably, the polymorphism is associated with a disease. Alternatively, the polymorphism is a blood group antigen. In one aspect of the preferred embodiment, the amplifying step is polymerase chain reaction. Alternatively, the amplification step is ligase chain reaction or strand displacement amplification. Preferably, the amplification product is ionized prior to mass determination. In one aspect, the amplification product is ionized by electrospray ionization, matrix-assisted laser desorption or fast atom bombardment. Advantageously, the molecular mass is determined by mass spectrometry. Preferably, the mass spectrometry is Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF or triple quadrupole.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A–1H and FIG. 2 are consensus diagrams that show examples of conserved regions from 16S rRNA (FIG. 1A-1, 1A-2, 1A-3, 1A-4, and 1A-5), 23S rRNA (3′-half, FIG. 1B, 1C, and 1D; 5′-half, FIG. 1E–F), 23S rRNA Domain I (FIG. 1G), 23S rRNA Domain IV (FIG. 1H) and 16S rRNA Domain III (FIG. 2) which are suitable for use in the present invention. Where there is overlap or redundancy between the figures, the overlap is simply provided as an orientation aid and no additional members of the sequence are implied thereby. Lines with arrows are examples of regions to which intelligent primer pairs for PCR are designed. The label for each primer pair represents the starting and ending base number of the amplified region on the consensus diagram. Bases in capital letters are greater than 95% conserved; bases in lower case letters are 90–95% conserved; filled circles are 80–90% conserved; and open circles are less than 80% conserved. The nucleotide sequence of the 16S rRNA consensus sequence is SEQ ID NO:3 and the nucleotide sequence of the 23S rRNA consensus sequence is SEQ ID NO:4.

FIG. 2 shows a typical primer amplified region from the 16S rRNA Domain III shown in FIG. 1A-1.

FIG. 3 is a schematic diagram showing conserved regions in RNase P. Bases in capital letters are greater than 90% conserved; bases in lower case letters are 80–90% conserved; filled circles designate bases which are 70–80% conserved; and open circles designate bases that are less than 70% conserved.

FIG. 4 is a schematic diagram of base composition signature determination using nucleotide analog “tags”.

FIG. 5 shows the deconvoluted mass spectra of a Bacillus anthracis region with and without the mass tag phosphorothioate A (A*). The two spectra differ in that the measured molecular weight of the mass tag-containing sequence is greater than that of the unmodified sequence.

FIG. 6 shows base composition signature (BCS) spectra from PCR products from Staphylococcus aureus (S. aureus 16S_1337F) and Bacillus anthracis (B. anthr. 16S_1337F), amplified using the same primers. The two strands differ by only two (AT→CG) substitutions and are clearly distinguished on the basis of their BCS.

FIG. 7 shows that a single difference between two sequences (A₁₄ in B. anthracis vs. A₁₅ in B. cereus) can be easily detected using ESI-TOF mass spectrometry.

FIG. 8 is an ESI-TOF spectrum of a Bacillus anthracis spore coat protein sspE 56mer plus calibrant. The signals unambiguously identify B. anthracis versus other Bacillus species.

FIG. 9 is an ESI-TOF spectrum of a B. anthracis synthetic 16S_1228 duplex (reverse and forward strands). The technique easily distinguishes between the forward and reverse strands.

FIG. 10 is an ESI-FTICR-MS spectrum of a synthetic B. anthracis 16S_1337 46 base pair duplex.

FIG. 11 is an ESI-TOF-MS spectrum of a 56mer oligonucleotide (3 scans) from the B. anthracis saspB gene with an internal mass standard. The internal mass standards are designated by asterisks.

FIG. 12 is an ESI-TOF-MS spectrum of an internal standard with 5 mM TBA-TFA buffer showing that charge stripping with tributylammonium trifluoroacetate reduces the most abundant charge state from [M-8H⁺]⁸⁻ to [M-3H⁺]³⁻.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a combination of a non-PCR biomass detection mode, preferably high-resolution MS, with PCR-based BCS technology using “intelligent primers” which hybridize to conserved sequence regions of nucleic acids derived from a bioagent and which bracket variable sequence regions that uniquely identify the bioagent. The high-resolution MS technique is used to determine the molecular mass and base composition signature (BCS) of the amplified sequence region. This unique BCS is then input to a maximum-likelihood detection algorithm for matching against a database of base composition signatures in the same amplified region. The present method combines PCR-based amplification technology (which provides specificity) and a molecular mass detection mode (which provides speed and does not require nucleic acid sequencing of the amplified target sequence) for bioagent detection and identification.
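The relationship between a measured molecular mass and a base composition can be illustrated with a short sketch. The residue masses and end correction below are approximate average values for a single unmodified DNA strand with 5′-OH and 3′-OH ends; they are assumptions for illustration, not values taken from this specification, and the function names are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand.
# Assumed illustrative values; a real analysis would use calibrated masses.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
# Assumed correction for a strand with 5'-OH and 3'-OH termini.
END_CORRECTION = -61.96


def strand_mass(a, g, c, t):
    """Approximate average mass of a strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_mass, length, tolerance=0.5, known_a=None):
    """List every (A, G, C, T) count of the given total length whose
    calculated mass lies within `tolerance` Da of the measured mass.
    If `known_a` is given, only compositions with that A count are kept."""
    matches = []
    for a in range(length + 1):
        if known_a is not None and a != known_a:
            continue
        for g in range(length - a + 1):
            for c in range(length - a - g + 1):
                t = length - a - g - c
                if abs(strand_mass(a, g, c, t) - measured_mass) <= tolerance:
                    matches.append((a, g, c, t))
    return matches
```

Under these assumed masses, the 46-mer compositions (12, 11, 10, 13) and (17, 8, 8, 13) differ by only about 0.06 Da, so a mass measurement alone may leave several candidates. Fixing the count of one nucleotide (the `known_a` argument) narrows the candidate list, which is consistent with the use of mass accuracy together with knowledge of at least one nucleotide count described in the Background.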

The present method allows extremely rapid and accurate detection and identification of bioagents compared to existing methods. Furthermore, this rapid detection and identification is possible even when sample material is impure. Thus, the method is useful in a wide variety of fields, including, but not limited to, environmental testing (e.g., detection and discrimination of pathogenic vs. non-pathogenic bacteria in water or other samples), germ warfare (allowing immediate identification of the bioagent and appropriate treatment), pharmacogenetic analysis and medical diagnosis (including cancer diagnosis based on mutations and polymorphisms; drug resistance and susceptibility testing; screening for and/or diagnosis of genetic diseases and conditions; and diagnosis of infectious diseases and conditions). The method leverages ongoing biomedical research in virulence, pathogenicity, drug resistance and genome sequencing into a method which provides greatly improved sensitivity, specificity and reliability compared to existing methods, with lower rates of false positives.

The present method can be used to detect and classify any biological agent, including bacteria, viruses, fungi and toxins. As one example, where the agent is a biological threat, the information obtained is used to determine practical information needed for countermeasures, including toxin genes, pathogenicity islands and antibiotic resistance genes. In addition, the methods can be used to identify natural or deliberate engineering events including chromosome fragment swapping, molecular breeding (gene shuffling) and emerging infectious diseases.

Bacteria have a common set of absolutely required genes. About 250 genes are present in all bacterial species (Proc. Natl. Acad. Sci. U.S.A. 93:10268, 1996; Science 270:397, 1995), including tiny genomes like Mycoplasma, Ureaplasma and Rickettsia. These genes encode proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like. Examples of these proteins are DNA polymerase III beta, elongation factor TU, heat shock protein groEL, RNA polymerase beta, phosphoglycerate kinase, NADH dehydrogenase, DNA ligase, DNA topoisomerase and elongation factor G. Operons can also be targeted using the present method. One example of an operon is the bfp operon from enteropathogenic E. coli. Multiple core chromosomal genes can be used to classify bacteria at a genus or genus species level to determine if an organism has threat potential. The method can also be used to detect pathogenicity markers (plasmid or chromosomal) and antibiotic resistance genes to confirm the threat potential of an organism and to direct countermeasures.

A theoretically ideal bioagent detector would identify, quantify, and report the complete nucleic acid sequence of every bioagent that reached the sensor. The complete sequence of the nucleic acid component of a pathogen would provide all relevant information about the threat, including its identity and the presence of drug-resistance or pathogenicity markers. This ideal has not yet been achieved. However, the present invention provides a straightforward strategy for obtaining information with the same practical value using base composition signatures (BCS). While the base composition of a gene fragment is not as information-rich as the sequence itself, there is no need to analyze the complete sequence of the gene if the short analyte sequence fragment is properly chosen. A database of reference sequences can be prepared in which each sequence is indexed to a unique base composition signature, so that the presence of the sequence can be inferred with accuracy from the presence of the signature. The advantage of base composition signatures is that they can be quantitatively measured in a massively parallel fashion using multiplex PCR (PCR in which two or more primer pairs amplify target sequences simultaneously) and mass spectrometry. These multiple primer amplified regions uniquely identify most threat and ubiquitous background bacteria and viruses. In addition, cluster-specific primer pairs distinguish important local clusters (e.g., anthracis group).

In the context of this invention, a “bioagent” is any organism, living or dead, or a nucleic acid derived from such an organism. Examples of bioagents include but are not limited to cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, toxin genes and bioregulating compounds. Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered.

As used herein, a “base composition signature” (BCS) is the exact base composition from selected fragments of nucleic acid sequences that uniquely identifies the target gene and source organism. BCS can be thought of as unique indexes of specific genes.
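The idea of a base composition serving as an index can be sketched in a few lines. The reference sequences and organism labels below are hypothetical placeholders, and the helper names are assumptions made for illustration only.

```python
from collections import Counter


def base_composition(seq):
    """Return the counts of (A, G, C, T) in a DNA sequence."""
    counts = Counter(seq.upper())
    unexpected = set(counts) - set("AGCT")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return (counts["A"], counts["G"], counts["C"], counts["T"])


def build_index(reference_sequences):
    """Map each base composition to the reference names that share it."""
    index = {}
    for name, seq in reference_sequences.items():
        index.setdefault(base_composition(seq), []).append(name)
    return index


def identify(seq, index):
    """Look up the names whose amplified region has the same composition."""
    return index.get(base_composition(seq), [])


# Hypothetical amplified regions for two made-up reference organisms.
REFERENCE = {
    "organism X": "ACGTTGCAAC",
    "organism Y": "ACGTTGCAGC",
}
INDEX = build_index(REFERENCE)
```

Because a composition ignores base order, two different sequences can share a signature; the index therefore stores a list of names per composition, and a properly chosen region is one for which that list has a single entry.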

As used herein, “intelligent primers” are primers which bind to sequence regions which flank an intervening variable region. In a preferred embodiment, these sequence regions which flank the variable region are highly conserved among different species of bioagent. For example, the sequence regions may be highly conserved among all Bacillus species. By the term “highly conserved”, it is meant that the sequence regions exhibit between about 80–100%, more preferably between about 90–100% and most preferably between about 95–100% identity. Examples of intelligent primers which amplify regions of the 16S and 23S rRNA are shown in FIGS. 1A–1H. A typical primer amplified region in 16S rRNA is shown in FIG. 2. The arrows represent primers which bind to highly conserved regions which flank a variable region in 16S rRNA domain III. The amplified region is the stem-loop structure under “1100–1188.”
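One way to put a number on “highly conserved” is to score each column of a sequence alignment by the fraction of sequences that carry the most common base, then average over the region. This per-column definition of identity is an assumption made for the sketch; the specification does not state how percent identity is computed.

```python
from collections import Counter


def column_conservation(aligned):
    """For each alignment column, return the fraction of sequences
    that carry the most common symbol in that column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    fractions = []
    for column in zip(*aligned):
        top_count = Counter(column).most_common(1)[0][1]
        fractions.append(top_count / len(aligned))
    return fractions


def percent_identity(aligned):
    """Mean column conservation over the region, as a percentage."""
    fractions = column_conservation(aligned)
    return 100.0 * sum(fractions) / len(fractions)
```

For the toy alignment `["ACGT", "ACGT", "ACGA", "ACGT"]`, three columns are fully conserved and one is 75% conserved, giving 93.75% for the region, which falls in the “about 90–100%” band named above.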

One main advantage of the detection methods of the present invention is that the primers need not be specific for a particular bacterial species, or even genus, such as Bacillus or Streptomyces. Instead, the primers recognize highly conserved regions across hundreds of bacterial species including, but not limited to, the species described herein. Thus, the same primer pair can be used to identify any desired bacterium because it will bind to the conserved regions which flank a variable region specific to a single species, or common to several bacterial species, allowing nucleic acid amplification of the intervening sequence and determination of its molecular weight and base composition. For example, the 16S_971–1062, 16S_1228–1310 and 16S_1100–1188 regions are 98–99% conserved in about 900 species of bacteria (16S=16S rRNA, numbers indicate nucleotide position). In one embodiment of the present invention, primers used in the present method bind to one or more of these regions or portions thereof.

The present invention provides a combination of a non-PCR biomass detection mode, preferably high-resolution MS, with nucleic acid amplification-based BCS technology using “intelligent primers” which hybridize to conserved regions and which bracket variable regions that uniquely identify the bioagent(s). Although the use of PCR is preferred, other nucleic acid amplification techniques may also be used, including ligase chain reaction (LCR) and strand displacement amplification (SDA). The high-resolution MS technique allows separation of bioagent spectral lines from background spectral lines in highly cluttered environments. The resolved spectral lines are then translated to BCS, which are input to a maximum-likelihood detection algorithm and matched against spectra for one or more known BCS. Preferably, the bioagent BCS spectrum is matched against one or more databases of BCS from vast numbers of bioagents. Preferably, the matching is done using a maximum-likelihood detection algorithm.
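The specification does not spell out the maximum-likelihood detection algorithm. As a minimal sketch of the general idea only, the following assumes independent Gaussian measurement error on the mass obtained from each primer pair and selects the database entry with the highest total log-likelihood. The database contents, the error width `sigma` and all names are hypothetical.

```python
import math


def gaussian_log_likelihood(measured, expected, sigma):
    """Log of the normal density of `measured` around `expected`."""
    z = (measured - expected) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def most_likely_bioagent(measured, database, sigma=1.0):
    """Return (name, log-likelihood) of the best-matching database entry.

    `measured` holds one mass per primer pair; each database entry holds
    the expected masses for the same primer pairs in the same order."""
    best_name, best_score = None, -math.inf
    for name, expected in database.items():
        if len(expected) != len(measured):
            raise ValueError(f"{name}: wrong number of expected masses")
        score = sum(gaussian_log_likelihood(m, e, sigma)
                    for m, e in zip(measured, expected))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical expected masses (Da) for two primer pairs per agent.
DATABASE = {
    "agent A": [14210.3, 17455.9],
    "agent B": [14226.3, 17455.9],
    "agent C": [14210.3, 17480.9],
}
```

Summing log-likelihoods across primer pairs shows why several amplified regions help: agents B and C each agree with agent A on one region in this made-up table, and only the combination separates all three.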

In a preferred embodiment, base composition signatures are quantitatively measured in a massively parallel fashion using the polymerase chain reaction (PCR), preferably multiplex PCR, and mass spectrometric (MS) methods. Sufficient quantities of nucleic acids must be present for detection of bioagents by MS. A wide variety of techniques for preparing large amounts of purified nucleic acids or fragments thereof are well known to those of skill in the art. PCR requires one or more pairs of oligonucleotide primers which bind to regions which flank the target sequence(s) to be amplified. Each primer of a pair primes synthesis of a different strand of DNA, with synthesis occurring in the direction of one primer towards the other primer. The primers, DNA to be amplified, a thermostable DNA polymerase (e.g. Taq polymerase), the four deoxynucleotide triphosphates, and a buffer are combined to initiate DNA synthesis. The solution is denatured by heating, then cooled to allow annealing of newly added primer, followed by another round of DNA synthesis. This process is typically repeated for about 30 cycles, resulting in amplification of the target sequence.
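The scale of amplification after about 30 cycles follows from simple arithmetic, sketched below. The model assumes each cycle multiplies the number of target copies by (1 + efficiency), with an efficiency of 1.0 meaning perfect doubling; real reactions plateau and fall short of this idealized figure.

```python
def amplified_copies(initial_copies, cycles=30, efficiency=1.0):
    """Idealized copy number after PCR: N0 * (1 + efficiency) ** cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

With perfect doubling, a single template yields 2^30 (about 1.07 × 10^9) copies after 30 cycles, which indicates how a small amount of bioagent nucleic acid can reach the quantities needed for MS detection.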

The “intelligent primers” define the target sequence region to be amplified and analyzed. In one embodiment, the target sequence is a ribosomal RNA (rRNA) gene sequence. With the complete sequences of many of the smallest microbial genomes now available, it is possible to identify a set of genes that defines “minimal life” and identify composition signatures that uniquely identify each gene and organism. Genes that encode core life functions such as DNA replication, transcription, ribosome structure, translation, and transport are distributed broadly in the bacterial genome and are preferred regions for BCS analysis. Ribosomal RNA (rRNA) genes comprise regions that provide useful base composition signatures. Like many genes involved in core life functions, rRNA genes contain sequences that are extraordinarily conserved across bacterial domains interspersed with regions of high variability that are more specific to each species. The variable regions can be utilized to build a database of base composition signatures. The strategy involves creating a structure-based alignment of sequences of the small (16S) and the large (23S) subunits of the rRNA genes. For example, there are currently over 13,000 sequences in the ribosomal RNA database that has been created and maintained by Robin Gutell, University of Texas at Austin, and is publicly available on the Institute for Cellular and Molecular Biology web page (www.rna.icmb.utexas.edu/). There is also a publicly available rRNA database created and maintained by the University of Antwerp, Belgium at www.rrna.uia.ac.be.

These databases have been analyzed to determine regions that are useful as base composition signatures. The characteristics of such regions are: a) between about 80 and 100%, preferably > about 95% identity among species of the particular bioagent of interest, of upstream and downstream nucleotide sequences which serve as sequence amplification primer sites; b) an intervening variable region which exhibits no greater than about 5% identity among species; and c) a separation of between about 30 and 1000 nucleotides, preferably no more than about 50–250 nucleotides, and more preferably no more than about 60–100 nucleotides, between the conserved regions.
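The three characteristics above can be read as a simple screening filter over candidate regions. The sketch below encodes the broadest stated limits as default thresholds; the `CandidateRegion` type, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CandidateRegion:
    upstream_identity: float    # % identity of the upstream primer site
    downstream_identity: float  # % identity of the downstream primer site
    variable_identity: float    # % identity of the intervening region
    separation_nt: int          # nucleotides between the conserved regions


def meets_criteria(region, min_flank=80.0, max_variable=5.0,
                   min_sep=30, max_sep=1000):
    """Apply criteria (a), (b) and (c) with the broadest stated limits."""
    flanks_conserved = (region.upstream_identity >= min_flank
                        and region.downstream_identity >= min_flank)
    variable_enough = region.variable_identity <= max_variable
    right_length = min_sep <= region.separation_nt <= max_sep
    return flanks_conserved and variable_enough and right_length
```

Passing `min_flank=95.0` and `max_sep=100` with `min_sep=60` would apply the preferred, narrower ranges instead of the broad ones.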

Due to their overall conservation, the flanking rRNA primer sequences serve as good “universal” primer binding sites to amplify the region of interest for most, if not all, bacterial species. The intervening region between the sets of primers varies in length and/or composition, and thus provides a unique base composition signature.

It is advantageous to design the “intelligent primers” to be as universal as possible to minimize the number of primers which need to be synthesized, and to allow detection of multiple species using a single pair of primers. These primer pairs can be used to amplify variable regions in these species. Because any variation in these conserved regions among species (due to codon wobble) is likely to occur in the third position of a DNA triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal base”. For example, under this “wobble” pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal bases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides 14:1001–1003, 1995), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides 14:1053–1056, 1995) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res. 24:3302–3306, 1996).
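Placing a universal base at the third position of each triplet can be sketched as a string transformation. The sketch assumes the primer is written 5′ to 3′ along the triplet reading frame and uses the letter I for inosine; the function name and the `frame_offset` parameter (the index of the first base that occupies position 1 of a triplet) are hypothetical.

```python
def wobble_primer(primer, universal="I", frame_offset=0):
    """Replace position 3 of every triplet with a universal base symbol."""
    designed = []
    for i, base in enumerate(primer.upper()):
        if (i - frame_offset) % 3 == 2:   # third position of a triplet
            designed.append(universal)
        else:
            designed.append(base)
    return "".join(designed)
```

For example, `wobble_primer("ATGGCTAAA")` gives `"ATIGCIAAI"`: the first two positions of each triplet are kept and every third position is replaced by inosine.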

In another embodiment of the invention, to compensate for the somewhat weaker binding by the “wobble” base, the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs which bind with greater affinity than the unmodified nucleotide. Examples of these analogs include 2,6-diaminopurine, which binds to thymine; propyne T, which binds to adenine; and propyne C and phenoxazines, including G-clamp, which bind to G. Propynes are described in U.S. Pat. Nos. 5,645,985, 5,830,653 and 5,484,908, the entire contents of which are incorporated herein by reference. Phenoxazines are described in U.S. Pat. Nos. 5,502,177, 5,763,588, and 6,005,096, the entire contents of which are incorporated herein by reference. G-clamps are described in U.S. Pat. Nos. 6,007,992 and 6,028,183, the entire contents of which are incorporated herein by reference.

Bacterial biological warfare agents capable of being detected by the present methods include Bacillus anthracis (anthrax), Yersinia pestis (pneumonic plague), Francisella tularensis (tularemia), Brucella suis, Brucella abortus, Brucella melitensis (undulant fever), Burkholderia mallei (glanders), Burkholderia pseudomallei (melioidosis), Salmonella typhi (typhoid fever), Rickettsia typhi (endemic typhus), Rickettsia prowazekii (epidemic typhus), Coxiella burnetii (Q fever), Rhodobacter capsulatus, Chlamydia pneumoniae, Escherichia coli, Shigella dysenteriae, Shigella flexneri, Bacillus cereus, Clostridium botulinum, Pseudomonas aeruginosa, Legionella pneumophila, and Vibrio cholerae.

Besides 16S and 23S rRNA, other target regions suitable for use in the present invention for detection of bacteria include 5S rRNA and RNase P (FIG. 3).

Fungal biological warfare agents include Coccidioides immitis (coccidioidomycosis).

Biological warfare toxin genes capable of being detected by the methods of the present invention include botulinum toxin, T-2 mycotoxins, ricin, staphylococcal enterotoxin B, Shiga toxin, abrin, aflatoxin, Clostridium perfringens epsilon toxin, conotoxins, diacetoxyscirpenol, tetrodotoxin and saxitoxin.

Biological warfare viral threat agents are mostly RNA viruses (positive-strand and negative-strand), with the exception of smallpox. Every RNA virus is a family of related viruses (quasispecies). These viruses mutate rapidly and the potential for new strains (natural or deliberately engineered) is very high. RNA viruses cluster into families that have conserved RNA structural domains on the viral genome (e.g., virion components, accessory proteins) and conserved housekeeping genes that encode core viral proteins including, for single strand positive strand RNA viruses, RNA-dependent RNA polymerase, double stranded RNA helicase, chymotrypsin-like and papain-like proteases and methyltransferases.

Examples of (−)-strand RNA viruses include arenaviruses (e.g., sabia virus, lassa fever, Machupo, Argentine hemorrhagic fever, flexal virus), bunyaviruses (e.g., hantavirus, nairovirus, phlebovirus, hantaan virus, Crimean-Congo hemorrhagic fever, rift valley fever), and mononegavirales (e.g., filovirus, paramyxovirus, ebola virus, Marburg, equine morbillivirus).

Examples of (+)-strand RNA viruses include picornaviruses (e.g., coxsackievirus, echovirus, human coxsackievirus A, human echovirus, human enterovirus, human poliovirus, hepatitis A virus, human parechovirus, human rhinovirus), astroviruses (e.g., human astrovirus), caliciviruses (e.g., chiba virus, chitta virus, human calicivirus, norwalk virus), nidovirales (e.g., human coronavirus, human torovirus), flaviviruses (e.g., dengue virus 1–4, Japanese encephalitis virus, Kyasanur forest disease virus, Murray Valley encephalitis virus, Rocio virus, St. Louis encephalitis virus, West Nile virus, yellow fever virus, hepatitis C virus) and togaviruses (e.g., Chikungunya virus, Eastern equine encephalitis virus, Mayaro virus, O'nyong-nyong virus, Ross River virus, Venezuelan equine encephalitis virus, Rubella virus, hepatitis E virus). The hepatitis C virus has a 5′-untranslated region of 340 nucleotides, an open reading frame encoding 9 proteins having 3010 amino acids and a 3′-untranslated region of 240 nucleotides. The 5′-UTR and 3′-UTR are 99% conserved in hepatitis C viruses.

In one embodiment, the target gene is an RNA-dependent RNA polymerase or a helicase encoded by (+)-strand RNA viruses, or RNA polymerase from a (−)-strand RNA virus. (+)-strand RNA viruses are double stranded RNA and replicate by RNA-directed RNA synthesis using RNA-dependent RNA polymerase and the positive strand as a template. Helicase unwinds the RNA duplex to allow replication of the single stranded RNA. These viruses include viruses from the family picornaviridae (e.g., poliovirus, coxsackievirus, echovirus), togaviridae (e.g., alphavirus, flavivirus, rubivirus), arenaviridae (e.g., lymphocytic choriomeningitis virus, lassa fever virus), coronaviridae (e.g., human respiratory virus) and Hepatitis A virus. The genes encoding these proteins comprise variable regions and highly conserved regions which flank the variable regions.

In a preferred embodiment, the detection scheme for the PCR products generated from the bioagent(s) incorporates three features. First, the technique simultaneously detects and differentiates multiple (generally about 6–10) PCR products. Second, the technique provides a BCS that uniquely identifies the bioagent from the possible primer sites. Finally, the detection technique is rapid, allowing multiple PCR reactions to be run in parallel.

In one embodiment, the method can be used to detect the presence of antibiotic resistance and/or toxin genes in a bacterial species. For example, Bacillus anthracis comprising a tetracycline resistance plasmid and plasmids encoding one or both anthracis toxins (pXO1 and/or pXO2) can be detected by using antibiotic resistance primer sets and toxin gene primer sets. If the B. anthracis is positive for tetracycline resistance, then a different antibiotic, for example a quinolone, is used.

Mass spectrometry (MS)-based detection of PCR products provides all of these features with additional advantages. MS is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product with a unique base composition is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons. Intact molecular ions can be generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). For example, MALDI of nucleic acids, along with examples of matrices for use in MALDI of nucleic acids, are described in WO 98/54751 (Genetrace, Inc.).

Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
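The averaging over charge states described above can be illustrated with a minimal sketch. It assumes negative-mode ions of the form [M − zH]^z−, a series of peaks from consecutive charge states of a single species, and ideal peak positions; the function names are hypothetical and real spectra would also require peak picking and noise handling.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_negative(mass, z):
    """m/z of the [M - zH]^z- ion of a molecule with neutral mass `mass`."""
    return (mass - z * PROTON) / z


def neutral_mass(mz, z):
    """Neutral mass implied by one negative-mode peak of charge z."""
    return z * (mz + PROTON)


def deconvolute(peaks):
    """Estimate charge and neutral mass from consecutive charge states.

    The charge of the highest-m/z peak follows from two adjacent peaks
    p1 > p2 (charges z and z + 1): z = (p2 + PROTON) / (p1 - p2).
    Returns (lowest charge, mean of the per-peak mass estimates).
    """
    peaks = sorted(peaks, reverse=True)
    z0 = round((peaks[1] + PROTON) / (peaks[0] - peaks[1]))
    masses = [neutral_mass(mz, z0 + i) for i, mz in enumerate(peaks)]
    return z0, sum(masses) / len(masses)
```

For instance, peaks simulated with `mz_negative(14208.373, z)` for z = 7, 8 and 9 are mapped back by `deconvolute` to a charge of 7 for the highest-m/z peak and a mean mass equal to the input mass.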

The mass detectors used in the methods of the present invention include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), ion trap, quadrupole, magnetic sector, time of flight (TOF), Q-TOF, and triple quadrupole.

In general, the mass spectrometric techniques which can be used in the present invention include, but are not limited to, tandem mass spectrometry, infrared multiphoton dissociation and pyrolytic gas chromatography mass spectrometry (PGC-MS). In one embodiment of the invention, the bioagent detection system operates continually in bioagent detection mode using pyrolytic GC-MS without PCR for rapid detection of increases in biomass (for example, increases in fecal contamination of drinking water or of germ warfare agents). To achieve minimal latency, a continuous sample stream flows directly into the PGC-MS combustion chamber. When an increase in biomass is detected, a PCR process is automatically initiated. Bioagent presence produces elevated levels of large molecular fragments from 100–7,000 Da which are observed in the PGC-MS spectrum. The observed mass spectrum is compared to a threshold level and when levels of biomass are determined to exceed a predetermined threshold, the bioagent classification process described hereinabove (combining PCR and MS, preferably FT-ICR MS) is initiated. Optionally, alarms or other processes (halting ventilation flow, physical isolation) are also initiated by this detected biomass level.

The accurate measurement of molecular mass for large DNAs is limited by the adduction of cations from the PCR reaction to each strand, resolution of the isotopic peaks from natural abundance ¹³C and ¹⁵N isotopes, and assignment of the charge state for any ion. The cations are removed by in-line dialysis using a flow-through chip that brings the solution containing the PCR products into contact with a solution containing ammonium acetate in the presence of an electric field gradient orthogonal to the flow. The latter two problems are addressed by operating with a resolving power of >100,000 and by incorporating isotopically depleted nucleotide triphosphates into the DNA. The resolving power of the instrument is also a consideration. At a resolving power of 10,000, the modeled signal from the [M-14H+]¹⁴⁻ charge state of an 84mer PCR product is poorly characterized and assignment of the charge state or exact mass is impossible. At a resolving power of 33,000, the peaks from the individual isotopic components are visible. At a resolving power of 100,000, the isotopic peaks are resolved to the baseline and assignment of the charge state for the ion is straightforward. The [¹³C, ¹⁵N]-depleted triphosphates are obtained, for example, by growing microorganisms on depleted media and harvesting the nucleotides (Batey et al., Nucl. Acids Res. 20:4515–4523, 1992).
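A rough calculation suggests why the three resolving powers quoted above behave so differently. The sketch below assumes that adjacent isotopic peaks of an ion with charge z are spaced by about 1.00335/z in m/z (the ¹³C/¹²C mass difference) and that two peaks are just separated when their spacing equals the peak width, so that the required resolving power is roughly (m/z) divided by the spacing. The 26,000 Da mass used for an 84mer is a hypothetical round figure, not a value from the text.

```python
ISOTOPE_SPACING = 1.00335  # approx. 13C - 12C mass difference in Da
PROTON = 1.007276          # mass of a proton in Da


def isotope_spacing(z):
    """Approximate m/z distance between adjacent isotopic peaks at charge z."""
    return ISOTOPE_SPACING / z


def min_resolving_power(mass, z):
    """Resolving power (m / delta-m) at which adjacent isotopic peaks of the
    [M - zH]^z- ion are separated by one peak width.

    A back-of-the-envelope estimate, not an instrument model.
    """
    mz = (mass - z * PROTON) / z
    return mz / isotope_spacing(z)
```

With the assumed 26,000 Da and z = 14, the estimate is close to 26,000: a resolving power of 10,000 falls well short of it, 33,000 is just above it, and 100,000 exceeds it several-fold, in line with the behavior described above.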

While mass measurements of intact nucleic acid regions are believed to be adequate to determine most bioagents, tandem mass spectrometry (MSⁿ) techniques may provide more definitive information pertaining to molecular identity or sequence. Tandem MS involves the coupled use of two or more stages of mass analysis where both the separation and detection steps are based on mass spectrometry. The first stage is used to select an ion or component of a sample from which further structural information is to be obtained. The selected ion is then fragmented using, e.g., blackbody irradiation, infrared multiphoton dissociation, or collisional activation. For example, ions generated by electrospray ionization (ESI) can be fragmented using IR multiphoton dissociation. This activation leads to dissociation of glycosidic bonds and the phosphate backbone, producing two series of fragment ions, called the w-series (having an intact 3′ terminus and a 5′ phosphate following internal cleavage) and the a-Base series (having an intact 5′ terminus and a 3′ furan).

The second stage of mass analysis is then used to detect and measure the mass of these resulting fragments of product ions. Such ion selection followed by fragmentation routines can be performed multiple times so as to essentially completely dissect the molecular sequence of a sample.

If there are two or more targets of similar base composition or mass, or if a single amplification reaction results in a product which has the same mass as two or more bioagent reference standards, they can be distinguished by using mass-modifying “tags.” In this embodiment of the invention, a nucleotide analog or “tag” is incorporated during amplification (e.g., a 5-(trifluoromethyl) deoxythymidine triphosphate) which has a different molecular weight than the unmodified base so as to improve distinction of masses. Such tags are described in, for example, PCT WO97/33000. This further limits the number of possible base compositions consistent with any mass. For example, 5-(trifluoromethyl)deoxythymidine triphosphate can be used in place of dTTP in a separate nucleic acid amplification reaction. Measurement of the mass shift between a conventional amplification product and the tagged product is used to quantitate the number of thymidine nucleotides in each of the single strands. Because the strands are complementary, the number of adenosine nucleotides in each strand is also determined.

In another amplification reaction, the number of G and C residues in each strand is determined using, for example, the cytidine analog 5-methylcytosine (5-meC) or propyne C. The combination of the A/T reaction and G/C reaction, followed by molecular weight determination, provides a unique base composition. This method is summarized in FIG. 4 and Table 1.

TABLE 1

Mass tag | Double strand sequence | Single strand sequence | Total Δmass this strand | Base info this strand | Base info other strand | Total base comp. Top strand | Total base comp. Bottom strand
T*Δmass (T* − T) = x | T*ACGT*ACGT* / AT*GCAT*GCA | T*ACGT*ACGT* | 3x | 3T | 3A | 3T 2A 2C 2G | 3A 2T 2G 2C
 | | AT*GCAT*GCA | 2x | 2T | 2A | |
C*Δmass (C* − C) = y | TAC*GTAC*GT / ATGC*ATGC*A | TAC*GTAC*GT | 2x | 2C | 2G | |
 | | ATGC*ATGC*A | 2x | 2C | 2G | |

The mass tag phosphorothioate A (A*) was used to distinguish a Bacillus anthracis cluster. The B. anthracis (A₁₄G₉C₁₄T₉) had an average MW of 14072.26, and the B. anthracis (A₁A*₁₃G₉C₁₄T₉) had an average molecular weight of 14281.11, so that each phosphorothioate A contributed an average mass shift of +16.06, as determined by ESI-TOF MS. The deconvoluted spectra are shown in FIG. 5.
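The arithmetic behind this example can be checked with a short sketch. The two molecular weights and the count of 13 tagged residues come from the paragraph above; the function names are hypothetical.

```python
def shift_per_tag(untagged_mw, tagged_mw, n_tags):
    """Average mass shift contributed by each incorporated tag."""
    return (tagged_mw - untagged_mw) / n_tags


def count_tags(untagged_mw, tagged_mw, per_tag_shift):
    """Number of tagged residues implied by the total mass shift."""
    return round((tagged_mw - untagged_mw) / per_tag_shift)


# Values from the B. anthracis example: 13 of the 14 A residues are tagged.
per_tag = shift_per_tag(14072.26, 14281.11, 13)
n_tagged = count_tags(14072.26, 14281.11, 16.06)
```

The same division, run in reverse, is how a measured shift between tagged and untagged products yields the number of residues of the tagged base in a strand.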

In another example, assume the measured molecular masses of each strand are 30,000.115 Da and 31,000.115 Da respectively, and the measured number of dT and dA residues are (30,28) and (28,30). If the molecular mass is accurate to 100 ppm, there are 7 possible combinations of dG+dC for each strand. However, if the measured molecular mass is accurate to 10 ppm, there are only 2 combinations of dG+dC, and at 1 ppm accuracy there is only one possible base composition for each strand.
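The narrowing of candidates with mass accuracy can be sketched as a brute-force search. The residue masses below are commonly used approximate average values for nucleotides within a DNA strand, and the end correction assumes 5′-OH and 3′-OH termini; both are assumptions, so the sketch is not expected to reproduce the exact counts of 7, 2 and 1 quoted above.

```python
# Approximate average residue masses (Da); assumed values, not from the text.
AVG_RESIDUE = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96  # assumed termini: 5'-OH and 3'-OH


def strand_mass(n_a, n_g, n_c, n_t):
    """Approximate average mass of a single strand from its base counts."""
    return (n_a * AVG_RESIDUE["A"] + n_g * AVG_RESIDUE["G"]
            + n_c * AVG_RESIDUE["C"] + n_t * AVG_RESIDUE["T"]
            + END_CORRECTION)


def tolerance_da(mass, ppm):
    """Absolute mass window (Da) for a relative accuracy given in ppm."""
    return mass * ppm * 1e-6


def gc_candidates(measured, n_a, n_t, ppm, max_count=120):
    """All (G, C) counts whose strand mass lies within the ppm window,
    given known A and T counts (e.g. from a mass-tag experiment)."""
    tol = tolerance_da(measured, ppm)
    return [(g, c)
            for g in range(max_count + 1)
            for c in range(max_count + 1)
            if abs(strand_mass(n_a, g, c, n_t) - measured) <= tol]
```

At 30,000.115 Da, 100 ppm corresponds to a window of about ±3 Da, 10 ppm to about ±0.3 Da and 1 ppm to about ±0.03 Da, which is why the list returned by `gc_candidates` shrinks as `ppm` decreases.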

Signals from the mass spectrometer may be input to a maximum-likelihood detection and classification algorithm such as is widely used in radar signal processing. The detection processing uses matched filtering of BCS observed in mass-basecount space and allows for detection and subtraction of signatures from known, harmless organisms, and for detection of unknown bioagent threats. Comparison of newly observed bioagents to known bioagents is also possible, for estimation of threat level, by comparing their BCS to those of known organisms and to known forms of pathogenicity enhancement, such as insertion of antibiotic resistance genes or toxin genes.

Processing may end with a Bayesian classifier using log likelihood ratios developed from the observed signals and average background levels. The program emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database (e.g. GenBank) is used to define the mass basecount matched filters. The database contains known threat agents and benign background organisms. The latter is used to estimate and subtract the signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this “cleaned up” data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise-covariance for the cleaned up data.
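As a highly simplified illustration of the likelihood comparison described above, the sketch below scores an observed set of masses against stored signatures and reports the most likely organism. It is not the algorithm of the invention: it assumes independent Gaussian measurement errors with a single hypothetical standard deviation `sigma`, and it omits background subtraction and the running-sum noise covariance estimate. The signature values are those listed in Table 4a; the function names are illustrative.

```python
import math

# Expected masses (Da) per primer pair, as listed in Table 4a.
SIGNATURES = {
    "Bacillus anthracis": {
        "23S_855": 42650.98, "16S_1337": 28447.65, "23S_1021": 30294.98},
    "Staphylococcus aureus": {
        "23S_855": 42654.97, "16S_1337": 28443.67, "23S_1021": 30297.96},
}


def log_likelihood(observed, expected, sigma):
    """Gaussian log likelihood of the observed masses given one signature."""
    total = 0.0
    for primer, mass in observed.items():
        residual = mass - expected[primer]
        total += (-0.5 * (residual / sigma) ** 2
                  - math.log(sigma * math.sqrt(2.0 * math.pi)))
    return total


def classify(observed, signatures=SIGNATURES, sigma=0.5):
    """Return the best-scoring organism and all log likelihood scores.

    sigma (Da) is an assumed measurement uncertainty, not a source value.
    """
    scores = {name: log_likelihood(observed, expected, sigma)
              for name, expected in signatures.items()}
    return max(scores, key=scores.get), scores
```

The difference between two entries of `scores` is a log likelihood ratio of the kind a Bayesian classifier would compare against a decision threshold.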

In one embodiment, a strategy to “triangulate” each organism by measuring signals from multiple core genes is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. After identification of multiple core genes, alignments are created from nucleic acid sequence databases. The alignments are then analyzed for regions of conservation and variation, and potential primer binding sites flanking variable regions are identified. Next, amplification target regions for signature analysis are selected which distinguish organisms based on specific genomic differences (i.e., base composition). For example, detection of signatures for the three part toxin genes typical of B. anthracis (Bowen, J. E. and C. P. Quinn, J. Appl. Microbiol. 1999, 87, 270–278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.

The present method can also be used to detect single nucleotide polymorphisms (SNPs), or multiple nucleotide polymorphisms, rapidly and accurately. A SNP is defined as a single base pair site in the genome that is different from one individual to another. The difference can be expressed either as a deletion, an insertion or a substitution, and is frequently linked to a disease state. Because they occur every 100–1000 base pairs, SNPs are the most frequently found type of genetic marker in the human genome.

For example, sickle cell anemia results from an A-to-T transversion, which encodes a valine rather than a glutamic acid residue. Oligonucleotide primers may be designed such that they bind to sequences which flank a SNP site, followed by nucleotide amplification and mass determination of the amplified product. Because the molecular mass of the resulting product from an individual who does not have sickle cell anemia is different from that of the product from an individual who has the disease, the method can be used to distinguish the two individuals. Thus, the method can be used to detect any known SNP in an individual and thus diagnose or determine increased susceptibility to a disease or condition.
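The size of the mass difference produced by a single substitution can be estimated as follows. The residue masses are commonly used approximate average values and are an assumption of this sketch, as are the function names; the example models an A-to-T change on one strand and the corresponding T-to-A change on the complementary strand.

```python
# Approximate average residue masses (Da); assumed values, not from the text.
AVG_RESIDUE = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def substitution_shift(ref_base, alt_base):
    """Mass change of one strand when ref_base is replaced by alt_base."""
    return AVG_RESIDUE[alt_base] - AVG_RESIDUE[ref_base]


def duplex_shifts(ref_base, alt_base):
    """Mass changes of both strands of the amplified duplex for one SNP."""
    return (substitution_shift(ref_base, alt_base),
            substitution_shift(COMPLEMENT[ref_base], COMPLEMENT[alt_base]))


# A-to-T on one strand, T-to-A on the other: about -9.01 Da and +9.01 Da.
forward_shift, reverse_shift = duplex_shifts("A", "T")
```

A shift of about 9 Da per strand is large compared with the low-ppm mass accuracy described for amplification products of this size.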

In one embodiment, blood is drawn from an individual and peripheral blood mononuclear cells (PBMC) are isolated and simultaneously tested, preferably in a high-throughput screening method, for one or more SNPs using appropriate primers based on the known sequences which flank the SNP region. The National Center for Biotechnology Information maintains a publicly available database of SNPs (www.ncbi.nlm.nih.gov/SNP/).

The method of the present invention can also be used for blood typing. The gene encoding A, B or O blood type can differ by four single nucleotide polymorphisms. If the gene contains the sequence CGTGGTGACCCTT (SEQ ID NO:5), antigen A results. If the gene contains the sequence CGTCGTCACCGCTA (SEQ ID NO:6) antigen B results. If the gene contains the sequence CGTGGT-ACCCCTT (SEQ ID NO:7), blood group O results (“−” indicates a deletion). These sequences can be distinguished by designing a single primer pair which flanks these regions, followed by amplification and mass determination.

While the present invention has been described with specificity in accordance with certain of its preferred embodiments, the following examples serve only to illustrate the invention and are not intended to limit the same.

EXAMPLE 1

Nucleic Acid Isolation and PCR

In one embodiment, nucleic acid is isolated from the organisms and amplified by PCR using standard methods prior to BCS determination by mass spectrometry. Nucleic acid is isolated, for example, by detergent lysis of bacterial cells, centrifugation and ethanol precipitation. Nucleic acid isolation methods are described in, for example, Current Protocols in Molecular Biology (Ausubel et al.) and Molecular Cloning: A Laboratory Manual (Sambrook et al.). The nucleic acid is then amplified using standard methodology, such as PCR, with primers which bind to conserved regions of the nucleic acid which contain an intervening variable sequence as described below.

EXAMPLE 2

Mass Spectrometry

FTICR Instrumentation: The FTICR instrument is based on a 7 tesla actively shielded superconducting magnet and modified Bruker Daltonics Apex II 70e ion optics and vacuum chamber. The spectrometer is interfaced to a LEAP PAL autosampler and a custom fluidics control system for high throughput screening applications. Samples are analyzed directly from 96-well or 384-well microtiter plates at a rate of about 1 sample/minute. The Bruker data-acquisition platform is supplemented with a lab-built ancillary NT datastation which controls the autosampler and contains an arbitrary waveform generator capable of generating complex rf-excite waveforms (frequency sweeps, filtered noise, stored waveform inverse Fourier transform (SWIFT), etc.) for sophisticated tandem MS experiments. For oligonucleotides in the 20–30-mer regime typical performance characteristics include mass resolving power in excess of 100,000 (FWHM), low ppm mass measurement errors, and an operable m/z range between 50 and 5000 m/z.

Modified ESI Source: In sample-limited analyses, analyte solutions are delivered at 150 nL/minute to a 30 μm i.d. fused-silica ESI emitter mounted on a 3-D micromanipulator. The ESI ion optics consist of a heated metal capillary, an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode. The 6.2 cm rf-only hexapole is comprised of 1 mm diameter rods and is operated at a voltage of 380 Vpp at a frequency of 5 MHz. A lab-built electromechanical shutter can be employed to prevent the electrospray plume from entering the inlet capillary unless triggered to the “open” position via a TTL pulse from the data station. When in the “closed” position, a stable electrospray plume is maintained between the ESI emitter and the face of the shutter. The back face of the shutter arm contains an elastomeric seal which can be positioned to form a vacuum seal with the inlet capillary. When the seal is removed, a 1 mm gap between the shutter blade and the capillary inlet allows constant pressure in the external ion reservoir regardless of whether the shutter is in the open or closed position. When the shutter is triggered, a “time slice” of ions is allowed to enter the inlet capillary and is subsequently accumulated in the external ion reservoir. The rapid response time of the ion shutter (<25 ms) provides reproducible, user defined intervals during which ions can be injected into and accumulated in the external ion reservoir.

Apparatus for Infrared Multiphoton Dissociation

A 25 watt CW CO₂ laser operating at 10.6 μm has been interfaced to the spectrometer to enable infrared multiphoton dissociation (IRMPD) for oligonucleotide sequencing and other tandem MS applications. An aluminum optical bench is positioned approximately 1.5 m from the actively shielded superconducting magnet such that the laser beam is aligned with the central axis of the magnet. Using standard IR-compatible mirrors and kinematic mirror mounts, the unfocused 3 mm laser beam is aligned to traverse directly through the 3.5 mm holes in the trapping electrodes of the FTICR trapped ion cell and longitudinally traverse the hexapole region of the external ion guide finally impinging on the skimmer cone. This scheme allows IRMPD to be conducted in an m/z selective manner in the trapped ion cell (e.g. following a SWIFT isolation of the species of interest), or in a broadband mode in the high pressure region of the external ion reservoir where collisions with neutral molecules stabilize IRMPD-generated metastable fragment ions resulting in increased fragment ion yield and sequence coverage.

EXAMPLE 3

Identification of Bioagents

Table 2 shows a small cross section of a database of calculated molecular masses for over 9 primer sets and approximately 30 organisms. The primer sets were derived from rRNA alignment. Examples of regions from rRNA consensus alignments are shown in FIGS. 1A–1C. Lines with arrows are examples of regions to which intelligent primer pairs for PCR are designed. The primer pairs are >95% conserved in the bacterial sequence database (currently over 10,000 organisms). The intervening regions are variable in length and/or composition, thus providing the base composition “signature” (BCS) for each organism. Primer pairs were chosen so the total length of the amplified region is less than about 80–90 nucleotides. The label for each primer pair represents the starting and ending base number of the amplified region on the consensus diagram.

Included in the short bacterial database cross-section in Table 2 are many well known pathogens/biowarfare agents (shown in bold/red typeface) such as Bacillus anthracis or Yersinia pestis as well as some of the bacterial organisms found commonly in the natural environment such as Streptomyces. Even closely related organisms can be distinguished from each other by the appropriate choice of primers. For instance, two low G+C organisms, Bacillus anthracis and Staphylococcus aureus, can be distinguished from each other by using the primer pair defined by 16S_1337 or 23S_855 (ΔM of 4 Da).

TABLE 2 Cross Section Of A Database Of Calculated Molecular Masses¹

Primer Regions →
Bug Name | 16S_971 | 16S_1100 | 16S_1337 | 16S_1294 | 16S_1228 | 23S_1021 | 23S_855 | 23S_193 | 23S_115
Acinetobacter calcoaceticus | 55619.1 | 55004 | 28446.7 | 35854.9 | 51295.4 | 30299 | 42654 | 39557.5 | 54999
[name missing] | 55005 | 54388 | 28448 | 35238 | 51296 | 30295 | 42651 | 39560 | 56850
Bacillus cereus | 55622.1 | 54387.9 | 28447.6 | 35854.9 | 51296.4 | 30295 | 42651 | 39560.5 | 56850.3
Bordetella bronchiseptica | 56857.3 | 51300.4 | 28446.7 | 35857.9 | 51307.4 | 30299 | 42653 | 39559.5 | 51920.5
Borrelia burgdorferi | 56231.2 | 55621.1 | 28440.7 | 35852.9 | 51295.4 | 30297 | 42029.9 | 38941.4 | 52524.6
[name missing] | 58098 | 55011 | 28448 | 35854 | 50683 | | | |
Campylobacter jejuni | 58088.5 | 54386.9 | 29061.8 | 35856.9 | 50674.3 | 30294 | 42032.9 | 39558.5 | 45732.5
[name missing] | 55000 | 55007 | 29063 | 35855 | 50676 | 30295 | 42036 | 38941 | 56230
[name missing] | 55006 | 53767 | 28445 | 35855 | 51291 | 30300 | 42656 | 39582 | 54999
Clostridium difficile | 56855.3 | 54386.9 | 28444.7 | 35853.9 | 51296.4 | 30294 | 41417.8 | 39556.5 | 55612.2
Enterococcus faecalis | 55620.1 | 54387.9 | 28447.6 | 35858.9 | 51296.4 | 30297 | 42652 | 39559.5 | 56849.3
[name missing] | 55622 | 55009 | 28445 | 35857 | 51301 | 30301 | 42656 | 39562 | 54999
[name missing] | 53769 | 54385 | 28445 | 35856 | 51298 | | | |
Haemophilus influenzae | 55620.1 | 55006 | 26444.7 | 35855.9 | 51298.4 | 30298 | 42656 | 39560.5 | 55613.1
Klebsiella pneumoniae | 55622.1 | 55008 | 28442.7 | 35856.9 | 51297.4 | 30300 | 42655 | 39562.5 | 55000
[name missing] | 55618 | 55628 | 28446 | 35857 | 51303 | | | |
Mycobacterium avium | 54390.9 | 55631.1 | 29064.8 | 35858.9 | 51915.5 | 30298 | 42656 | 38942.4 | 56241.2
Mycobacterium leprae | 54389.9 | 55629.1 | 29064.8 | 35860.9 | 51917.5 | 30298 | 42656 | 39559.5 | 56240.2
Mycobacterium tuberculosis | 54390.9 | 55629.1 | 29064.8 | 35860.9 | 51301.4 | 30299 | 42656 | 39560.5 | 56243.2
Mycoplasma genitalium | 53143.7 | 45115.4 | 29061.8 | 35854.9 | 50671.3 | 30294 | 43264.1 | 39558.5 | 56842.4
Mycoplasma pneumoniae | 53143.7 | 45118.4 | 29061.8 | 35854.9 | 50673.3 | 30294 | 43264.1 | 39559.5 | 56843.4
Neisseria gonorrhoeae | 55627.1 | 54389.9 | 28445.7 | 35855.9 | 51302.4 | 30300 | 42649 | 39561.5 | 55000
[name missing] | 55623 | 55010 | 28443 | 35858 | 51301 | 30298 | 43272 | 39558 | 55619
[name missing] | 58093 | 55621 | 28448 | 35853 | 50677 | 30293 | 42650 | 39559 | 53139
[name missing] | 58094 | 55623 | 28448 | 35853 | 50679 | 30293 | 42648 | 39559 | 53755
[name missing] | 55622 | 55005 | 28445 | 35857 | 51301 | 30301 | 42658 | |
[name missing] | 55623 | 55009 | 28444 | 35857 | 51301 | | | |
Staphylococcus aureus | 56854.3 | 54386.9 | 28443.7 | 35852.9 | 51294.4 | 30298 | 42655 | 39559.5 | 57466.4
Streptomyces | 54389.9 | 59341.6 | 20963.8 | 35858.9 | 51300.4 | | | 39563.5 | 56864.3
Treponema pallidum | 56245.2 | 55631.1 | 28445.7 | 35851.9 | 51297.4 | 30299 | 42034.9 | 38939.4 | 57473.4
[name missing] | 55625 | 55626 | 28443 | 35857 | 52536 | 29063 | 30303 | 35241 | 50675
Vibrio parahaemolyticus | 54384.9 | 55626.1 | 28444.7 | 34620.7 | 50064.2 | | | |
[name missing] | 55620 | 55626 | 28443 | 35857 | 51299 | | | |

¹Molecular mass distribution of PCR amplified regions for a selection of organisms (rows) across various primer pairs (columns). Pathogens are shown in bold. Empty cells indicate presently incomplete or missing data.

FIG. 6 shows the use of ESI-FT-ICR MS for measurement of exact mass. The spectra from 46mer PCR products originating at position 1337 of the 16S rRNA from S. aureus (upper) and B. anthracis (lower) are shown. These data are from the region of the spectrum containing signals from the [M-8H+]⁸⁻ charge states of the respective 5′-3′ strands. The two strands differ by two (AT→CG) substitutions, and have measured masses of 14206.396 and 14208.373±0.010 Da, respectively. The possible base compositions derived from the masses of the forward and reverse strands for the B. anthracis products are listed in Table 3.

TABLE 3 Possible base composition for B. anthracis products

Calc. Mass | Error | Base Comp.
14208.2935 | 0.079520 | A1 G17 C10 T18
14208.3160 | 0.056980 | A1 G20 C15 T10
14208.3386 | 0.034440 | A1 G23 C20 T2
14208.3074 | 0.065560 | A6 G11 C3 T26
14208.3300 | 0.043020 | A6 G14 C8 T18
14208.3525 | 0.020480 | A6 G17 C13 T10
14208.3751 | 0.002060 | A6 G20 C18 T2
14208.3439 | 0.029060 | A11 G8 C1 T26
14208.3665 | 0.006520 | A11 G11 C6 T18
14208.3890 | 0.016020 | A11 G14 C11 T10 *
14208.4116 | 0.038560 | A11 G17 C16 T2
14208.4030 | 0.029980 | A16 G8 C4 T18
14208.4255 | 0.052520 | A16 G11 C9 T10
14208.4481 | 0.075060 | A16 G14 C14 T2
14208.4395 | 0.066480 | A21 G5 C2 T18
14208.4620 | 0.089020 | A21 G8 C7 T10
14079.2624 | 0.080600 | A0 G14 C13 T19
14079.2849 | 0.058060 | A0 G17 C18 T11
14079.3075 | 0.035520 | A0 G20 C23 T3
14079.2538 | 0.089180 | A5 G5 C1 T35
14079.2764 | 0.066640 | A5 G8 C6 T27
14079.2989 | 0.044100 | A5 G11 C11 T19
14079.3214 | 0.021560 | A5 G14 C16 T11
14079.3440 | 0.000980 | A5 G17 C21 T3
14079.3129 | 0.030140 | A10 G5 C4 T27
14079.3354 | 0.007600 | A10 G8 C9 T19
14079.3579 | 0.014940 | A10 G11 C14 T11 *
14079.3805 | 0.037480 | A10 G14 C19 T3
14079.3494 | 0.006360 | A15 G2 C2 T27
14079.3719 | 0.028900 | A15 G5 C7 T19
14079.3944 | 0.051440 | A15 G8 C12 T11
14079.4170 | 0.073980 | A15 G11 C17 T3
14079.4084 | 0.065400 | A20 G2 C5 T19
14079.4309 | 0.087940 | A20 G5 C10 T13

Among the 16 compositions for the forward strand and the 18 compositions for the reverse strand that were calculated, only one pair (marked with an asterisk) are complementary, corresponding to the actual base compositions of the B. anthracis PCR products.
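The complementarity filter applied to Table 3 can be sketched in a few lines. Each composition is written as a tuple of (A, G, C, T) counts copied from the table; a forward composition (a, g, c, t) is complementary to a reverse composition with A = t, G = c, C = g and T = a. The function name is illustrative.

```python
# (A, G, C, T) counts for the forward-strand candidates in Table 3.
FORWARD = [
    (1, 17, 10, 18), (1, 20, 15, 10), (1, 23, 20, 2), (6, 11, 3, 26),
    (6, 14, 8, 18), (6, 17, 13, 10), (6, 20, 18, 2), (11, 8, 1, 26),
    (11, 11, 6, 18), (11, 14, 11, 10), (11, 17, 16, 2), (16, 8, 4, 18),
    (16, 11, 9, 10), (16, 14, 14, 2), (21, 5, 2, 18), (21, 8, 7, 10),
]

# (A, G, C, T) counts for the reverse-strand candidates in Table 3.
REVERSE = [
    (0, 14, 13, 19), (0, 17, 18, 11), (0, 20, 23, 3), (5, 5, 1, 35),
    (5, 8, 6, 27), (5, 11, 11, 19), (5, 14, 16, 11), (5, 17, 21, 3),
    (10, 5, 4, 27), (10, 8, 9, 19), (10, 11, 14, 11), (10, 14, 19, 3),
    (15, 2, 2, 27), (15, 5, 7, 19), (15, 8, 12, 11), (15, 11, 17, 3),
    (20, 2, 5, 19), (20, 5, 10, 13),
]


def complementary_pairs(forward, reverse):
    """Return (forward, reverse) pairs whose base counts are complementary."""
    reverse_set = set(reverse)
    pairs = []
    for a, g, c, t in forward:
        partner = (t, c, g, a)  # A pairs with T, G pairs with C
        if partner in reverse_set:
            pairs.append(((a, g, c, t), partner))
    return pairs
```

Applied to the two lists above, the function returns a single pair, A11 G14 C11 T10 for the forward strand and A10 G11 C14 T11 for the reverse strand, each summing to 46 nucleotides.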

EXAMPLE 4

BCS of Region from Bacillus anthracis and Bacillus cereus

A conserved Bacillus region from B. anthracis (A₁₄G₉C₁₄T₉) and B. cereus (A₁₅G₉C₁₃T₉) having a C to A base change was synthesized and subjected to ESI-TOF MS. The results are shown in FIG. 7 in which the two regions are clearly distinguished using the method of the present invention (MW=14072.26 vs. 14096.29).
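The two molecular weights can be approximated directly from the base compositions. The sketch below uses commonly quoted approximate average residue masses and an end correction for a synthetic strand with 5′-OH and 3′-OH termini; these constants and the function names are assumptions rather than values from this document, so agreement with the reported figures is only expected to within a fraction of a Dalton.

```python
import re

# Approximate average residue masses (Da); assumed values, not from the text.
AVG_RESIDUE = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96  # assumed termini: 5'-OH and 3'-OH


def parse_composition(text):
    """Turn a string such as 'A14G9C14T9' into {'A': 14, 'G': 9, ...}."""
    return {base: int(count)
            for base, count in re.findall(r"([AGCT])(\d+)", text)}


def average_mw(composition):
    """Approximate average molecular weight of a single DNA strand."""
    return sum(AVG_RESIDUE[base] * count
               for base, count in composition.items()) + END_CORRECTION


anthracis = average_mw(parse_composition("A14G9C14T9"))
cereus = average_mw(parse_composition("A15G9C13T9"))
difference = cereus - anthracis  # one C replaced by one A
```

The single C to A change accounts for the whole difference, about 24.03 Da with these constants, which matches the gap between 14072.26 and 14096.29.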

EXAMPLE 5

Identification of Additional Bioagents

In other examples of the present invention, the pathogen Vibrio cholerae can be distinguished from Vibrio parahaemolyticus with ΔM>600 Da using one of three 16S primer sets shown in Table 2 (16S_971, 16S_1228 or 16S_1294) as shown in Table 4b. The two mycoplasma species in the list (M. genitalium and M. pneumoniae) can also be distinguished from each other, as can the three mycobacteria. While the direct mass measurements of amplified products can identify and distinguish a large number of organisms, measurement of the base composition signature provides dramatically enhanced resolving power for closely related organisms. In cases such as Bacillus anthracis and Bacillus cereus that are virtually indistinguishable from each other based solely on mass differences, compositional analysis or fragmentation patterns are used to resolve the differences. The single base difference between the two organisms yields different fragmentation patterns, and despite the presence of the ambiguous/unidentified base N at position 20 in B. anthracis, the two organisms can be identified.

Tables 4a–b show examples of primer pairs from Table 2 which distinguish pathogens from background.

TABLE 4a

Organism name            23S_855     16S_1337    23S_1021
Bacillus anthracis       42650.98    28447.65    30294.98
Staphylococcus aureus    42654.97    28443.67    30297.96

TABLE 4b

Organism name              16S_971     16S_1294    16S_1228
Vibrio cholerae            55625.09    35856.87    52535.59
Vibrio parahaemolyticus    54384.91    34620.67    50064.19
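The mass differences implied by Tables 4a and 4b can be computed directly. The sketch below copies the tabulated masses into a dictionary and takes the absolute difference per primer pair; the names `MASSES` and `mass_differences` are illustrative, not part of the described method.

```python
# Expected product masses (Da) copied from Tables 4a and 4b.
MASSES = {
    "Bacillus anthracis": {"23S_855": 42650.98, "16S_1337": 28447.65, "23S_1021": 30294.98},
    "Staphylococcus aureus": {"23S_855": 42654.97, "16S_1337": 28443.67, "23S_1021": 30297.96},
    "Vibrio cholerae": {"16S_971": 55625.09, "16S_1294": 35856.87, "16S_1228": 52535.59},
    "Vibrio parahaemolyticus": {"16S_971": 54384.91, "16S_1294": 34620.67, "16S_1228": 50064.19},
}


def mass_differences(organism_a, organism_b):
    """Absolute mass difference (Da) per primer pair, rounded to 0.01 Da."""
    masses_a, masses_b = MASSES[organism_a], MASSES[organism_b]
    return {
        primer: round(abs(masses_a[primer] - masses_b[primer]), 2)
        for primer in masses_a
    }


vibrio = mass_differences("Vibrio cholerae", "Vibrio parahaemolyticus")
bacillus_staph = mass_differences("Bacillus anthracis", "Staphylococcus aureus")

print("V. cholerae vs V. parahaemolyticus:", vibrio)
print("B. anthracis vs S. aureus:", bacillus_staph)
```

For the two Vibrio species every primer pair gives a difference well above 600 Da, whereas the Table 4a pairs differ by only a few daltons, so the latter depend on the mass measurement accuracy of the instrument.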

Table 5 shows the expected molecular weight and base composition of region 16S_1100–1188 in Mycobacterium avium and Streptomyces sp.

TABLE 5

Region           Organism name          Length    Molecular weight    Base comp.
16S_1100-1188    Mycobacterium avium    82        25624.1728          A₁₆G₃₂C₁₈T₁₆
16S_1100-1188    Streptomyces sp.       96        29904.871           A₁₇G₃₈C₂₇T₁₄
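The Length column of Table 5 is simply the sum of the four base counts in the base composition. A minimal parsing sketch, assuming compositions are written in the subscripted AxGyCzTw notation used in the tables (the helper names are hypothetical):

```python
import re

# Map Unicode subscript digits to ASCII digits so the counts can be parsed.
SUBSCRIPT_TO_ASCII = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")


def parse_composition(text):
    """Turn a string such as 'A₁₆G₃₂C₁₈T₁₆' into {'A': 16, 'G': 32, 'C': 18, 'T': 16}."""
    plain = text.translate(SUBSCRIPT_TO_ASCII)
    return {base: int(count) for base, count in re.findall(r"([AGCT])(\d+)", plain)}


def product_length(text):
    """Number of nucleotides implied by a base composition."""
    return sum(parse_composition(text).values())


# Base compositions copied from Table 5.
TABLE_5 = {
    "Mycobacterium avium": "A₁₆G₃₂C₁₈T₁₆",
    "Streptomyces sp.": "A₁₇G₃₈C₂₇T₁₄",
}

lengths = {organism: product_length(comp) for organism, comp in TABLE_5.items()}
print(lengths)
```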

Table 6 shows base composition (single strand) results for 16S_1100–1188 primer amplification reactions for different species of bacteria. Species which are repeated in the table (e.g., Clostridium botulinum) are different strains which have different base compositions in the 16S_1100–1188 region.

TABLE 6

Organism name             Base comp.    Organism name              Base comp.
Mycobacterium avium       A₁₆G₃₂C₁₈T₁₆  Vibrio cholerae            A₂₃G₃₀C₂₁T₁₆
Streptomyces sp.          A₁₇G₃₈C₂₇T₁₄  […]                        A₂₃G₃₁C₂₁T₁₅
Ureaplasma urealyticum    A₁₈G₃₀C₁₇T₁₇  […]                        A₂₃G₃₁C₂₁T₁₅
Streptomyces sp.          A₁₉G₃₆C₂₄T₁₈  Mycoplasma genitalium      A₂₄G₁₉C₁₂T₁₈
Mycobacterium leprae      A₂₀G₃₂C₂₂T₁₆  Clostridium botulinum      A₂₄G₂₅C₁₈T₂₀
[…]                       A₂₀G₃₃C₂₁T₁₆  Bordetella bronchiseptica  A₂₄G₂₆C₁₉T₁₄
[…]                       A₂₀G₃₃C₂₁T₁₆  Francisella tularensis     A₂₄G₂₆C₁₉T₁₉
Fusobacterium necroforum  A₂₁G₂₆C₂₂T₁₈  […]                        A₂₄G₂₆C₂₀T₁₈
Listeria monocytogenes    A₂₁G₂₇C₁₉T₁₉  […]                        A₂₄G₂₆C₂₀T₁₈
Clostridium botulinum     A₂₁G₂₇C₁₉T₂₁  […]                        A₂₄G₂₆C₂₀T₁₈
Neisseria gonorrhoeae     A₂₁G₂₈C₂₁T₁₈  Helicobacter pylori        A₂₄G₂₆C₂₀T₁₉
Bartonella quintana       A₂₁G₃₀C₂₂T₁₆  Helicobacter pylori        A₂₄G₂₆C₂₁T₁₈
Enterococcus faecalis     A₂₂G₂₇C₂₀T₁₉  Moraxella catarrhalis      A₂₄G₂₆C₂₃T₁₆
Bacillus megaterium       A₂₂G₂₈C₂₀T₁₈  Haemophilus influenzae Rd  A₂₄G₂₈C₂₀T₁₇
Bacillus subtilis         A₂₂G₂₈C₂₁T₁₇  […]                        A₂₄G₂₈C₂₁T₁₆
Pseudomonas aeruginosa    A₂₂G₂₉C₂₃T₁₅  […]                        A₂₄G₂₈C₂₁T₁₆
Legionella pneumophila    A₂₂G₃₂C₂₀T₁₆  […] AR39                   A₂₄G₂₈C₂₁T₁₆
Mycoplasma pneumoniae     A₂₃G₂₀C₁₄T₁₆  Pseudomonas putida         A₂₄G₂₉C₂₁T₁₆
Clostridium botulinum     A₂₃G₂₆C₂₀T₁₉  […]                        A₂₄G₃₀C₂₁T₁₅
Enterococcus faecium      A₂₃G₂₆C₂₁T₁₈  […]                        A₂₄G₃₀C₂₁T₁₅
Acinetobacter calcoaceti  A₂₃G₂₆C₂₁T₁₉  […]                        A₂₄G₃₀C₂₁T₁₅
[…]                       A₂₃G₂₆C₂₄T₁₅  Clostridium botulinum      A₂₅G₂₄C₁₈T₂₁
[…]                       A₂₃G₂₆C₂₄T₁₅  Clostridium tetani         A₂₅G₂₅C₁₈T₂₀
Clostridium perfringens   A₂₃G₂₇C₁₉T₁₉  Francisella tularensis     A₂₅G₂₅C₁₉T₁₉
[…]                       A₂₃G₂₇C₂₀T₁₈  Acinetobacter calcoacetic  A₂₅G₂₆C₂₀T₁₉
[…]                       A₂₃G₂₇C₂₀T₁₈  Bacteroides fragilis       A₂₅G₂₇C₁₆T₂₂
[…]                       A₂₃G₂₇C₂₀T₁₈  Chlamydophila psittaci     A₂₅G₂₇C₂₁T₁₆
Aeromonas hydrophila      A₂₃G₂₉C₂₁T₁₆  Borrelia burgdorferi       A₂₅G₂₉C₁₇T₁₉
Escherichia coli          A₂₃G₂₉C₂₁T₁₆  Streptobacillus monilifor  A₂₆G₂₆C₂₀T₁₆
Pseudomonas putida        A₂₃G₂₉C₂₁T₁₇  Rickettsia prowazekii      A₂₆G₂₈C₁₈T₁₈
[…]                       A₂₃G₂₉C₂₂T₁₅  Rickettsia rickettsii      A₂₆G₂₈C₂₀T₁₆
[…]                       A₂₃G₂₉C₂₂T₁₅  Mycoplasma mycoides        A₂₈G₂₃C₁₆T₂₀

Entries for the same organism with different base compositions are different strains. Groups of organisms which are highlighted or in italics have the same base compositions in the amplified region. Some of these organisms can be distinguished using multiple primers. For example, Bacillus anthracis can be distinguished from Bacillus cereus and Bacillus thuringiensis using the primer 16S_971–1062 (Table 7). Other primer pairs which produce unique base composition signatures are shown in Table 7 (bold). Clusters containing very similar threat and ubiquitous non-threat organisms (e.g. anthracis cluster) are distinguished at high resolution with focused sets of primer pairs. The known biowarfare agents in Table 7 are Bacillus anthracis, Yersinia pestis, Francisella tularensis and Rickettsia prowazekii.

TABLE 7

Organism                      16S_971-1062    16S_1228-1310   16S_1100-1188
Aeromonas hydrophila          A₂₁G₂₉C₂₂T₂₀    A₂₂G₂₇C₂₁T₁₃    A₂₃G₃₁C₂₁T₁₅
Aeromonas salmonicida         A₂₁G₂₉C₂₂T₂₀    A₂₂G₂₇C₂₁T₁₃    A₂₃G₃₁C₂₁T₁₅
Bacillus anthracis            A₂₁G₂₇C₂₂T₂₂*   A₂₄G₂₂C₁₉T₁₈    A₂₃G₂₇C₂₀T₁₈
Bacillus cereus               A₂₂G₂₇C₂₁T₂₂    A₂₄G₂₂C₁₉T₁₈    A₂₃G₂₇C₂₀T₁₈
Bacillus thuringiensis        A₂₂G₂₇C₂₁T₂₂    A₂₄G₂₂C₁₉T₁₈    A₂₃G₂₇C₂₀T₁₈
Chlamydia trachomatis         A₂₂G₂₆C₂₀T₂₃*   A₂₄G₂₃C₁₉T₁₆*   A₂₄G₂₈C₂₁T₁₆
Chlamydia pneumoniae AR39     A₂₆G₂₃C₂₀T₂₂    A₂₆G₂₂C₁₆T₁₈    A₂₄G₂₈C₂₁T₁₆
Leptospira borgpetersenii     A₂₂G₂₆C₂₀T₂₁    A₂₂G₂₅C₂₁T₁₅    A₂₃G₂₆C₂₄T₁₅
Leptospira interrogans        A₂₂G₂₆C₂₀T₂₁    A₂₂G₂₅C₂₁T₁₅    A₂₃G₂₆C₂₄T₁₅
Mycoplasma genitalium         A₂₈G₂₃C₁₅T₂₂    A₃₀G₁₈C₁₅T₁₉*   A₂₄G₁₉C₁₂T₁₈*
Mycoplasma pneumoniae         A₂₈G₂₃C₁₅T₂₂    A₂₇G₁₉C₁₆T₂₀*   A₂₃G₂₀C₁₄T₁₆*
Escherichia coli              A₂₂G₂₈C₂₀T₂₂*   A₂₄G₂₅C₂₁T₁₃    A₂₃G₂₉C₂₂T₁₅
Shigella dysenteriae          A₂₂G₂₈C₂₁T₂₁*   A₂₄G₂₅C₂₁T₁₃    A₂₃G₂₉C₂₂T₁₅
Proteus vulgaris              A₂₃G₂₆C₂₂T₂₁*   A₂₆G₂₄C₁₉T₁₄*   A₂₄G₃₀C₂₁T₁₅
Yersinia pestis               A₂₄G₂₅C₂₁T₂₂    A₂₅G₂₄C₂₀T₁₄    A₂₄G₃₀C₂₁T₁₅
Yersinia pseudotuberculosis   A₂₄G₂₅C₂₁T₂₂    A₂₅G₂₄C₂₀T₁₄    A₂₄G₃₀C₂₁T₁₅
Francisella tularensis        A₂₀G₂₅C₂₁T₂₃*   A₂₃G₂₆C₁₇T₁₇*   A₂₄G₂₆C₁₉T₁₉*
Rickettsia prowazekii         A₂₁G₂₆C₂₄T₂₅*   A₂₄G₂₃C₁₆T₁₉*   A₂₆G₂₈C₁₈T₁₈*
Rickettsia rickettsii         A₂₁G₂₆C₂₅T₂₄*   A₂₄G₂₄C₁₇T₁₇*   A₂₆G₂₈C₂₀T₁₆*

* = bold entry
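The effect of combining primer pairs can be illustrated by grouping organisms on their base composition signatures. The sketch below uses a five-organism subset of Table 7 and reports which organisms remain indistinguishable for a chosen set of primer pairs; the function name `unresolved_groups` and the data layout are assumptions made for illustration only.

```python
from collections import defaultdict

PRIMERS = ("16S_971-1062", "16S_1228-1310", "16S_1100-1188")

# Subset of Table 7: one base composition per primer pair, in PRIMERS order.
SIGNATURES = {
    "Bacillus anthracis": ("A21G27C22T22", "A24G22C19T18", "A23G27C20T18"),
    "Bacillus cereus": ("A22G27C21T22", "A24G22C19T18", "A23G27C20T18"),
    "Bacillus thuringiensis": ("A22G27C21T22", "A24G22C19T18", "A23G27C20T18"),
    "Escherichia coli": ("A22G28C20T22", "A24G25C21T13", "A23G29C22T15"),
    "Shigella dysenteriae": ("A22G28C21T21", "A24G25C21T13", "A23G29C22T15"),
}


def unresolved_groups(signatures, primer_indices):
    """Groups of organisms that share identical compositions for all chosen primers."""
    groups = defaultdict(list)
    for organism, compositions in signatures.items():
        key = tuple(compositions[i] for i in primer_indices)
        groups[key].append(organism)
    return [names for names in groups.values() if len(names) > 1]


# With 16S_1100-1188 alone, the Bacillus cluster and E. coli/Shigella are unresolved.
single_primer = unresolved_groups(SIGNATURES, [2])
# Adding the other two primer pairs separates everything except B. cereus and
# B. thuringiensis, which share all three compositions in Table 7.
all_primers = unresolved_groups(SIGNATURES, [0, 1, 2])

print("16S_1100-1188 only:", single_primer)
print("all three primer pairs:", all_primers)
```

In this subset, B. anthracis is separated from the other two Bacillus species only by the 16S_971-1062 product, consistent with the text above.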

The sequence of B. anthracis and B. cereus in region 16S_971 is shown below. Shown in bold is the single base difference between the two species which can be detected using the methods of the present invention. B. anthracis has an ambiguous base at position 20.

B.anthracis_16S_971 (SEQ ID NO: 1)
GCGAAGAACCUUACCAGGUNUUGACAUCCUCUGACAACCCUAGAGAUAGGGCUUCUCCUUC
GGGAGCAGAGUGACAGGUGGUGCAUGGUU

B.cereus_16S_971 (SEQ ID NO: 2)
GCGAAGAACCUUACCAGGUCUUGACAUCCUCUGAAAACCCUAGAGAUAGGGCUUCUCCUUC
GGGAGCAGAGUGACAGGUGGUGCAUGGUU
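The positions at which the two sequences differ can be located with a simple position-by-position comparison. The sketch below copies SEQ ID NO: 1 and SEQ ID NO: 2 as printed and separates the ambiguous base N from the informative substitution; the helper name `differences` is illustrative.

```python
# Sequences copied from SEQ ID NO: 1 and SEQ ID NO: 2.
B_ANTHRACIS = (
    "GCGAAGAACCUUACCAGGUNUUGACAUCCUCUGACAACCCUAGAGAUAGGGCUUCUCCUUC"
    "GGGAGCAGAGUGACAGGUGGUGCAUGGUU"
)
B_CEREUS = (
    "GCGAAGAACCUUACCAGGUCUUGACAUCCUCUGAAAACCCUAGAGAUAGGGCUUCUCCUUC"
    "GGGAGCAGAGUGACAGGUGGUGCAUGGUU"
)


def differences(seq_a, seq_b):
    """List (1-based position, base in seq_a, base in seq_b) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


mismatches = differences(B_ANTHRACIS, B_CEREUS)
# A mismatch involving N is an ambiguity, not a confirmed base change.
ambiguous = [m for m in mismatches if "N" in m[1:]]
informative = [m for m in mismatches if "N" not in m[1:]]

print("ambiguous positions:", ambiguous)
print("informative differences:", informative)
```

Apart from the ambiguous base at position 20, the only mismatch is a C in B. anthracis against an A in B. cereus.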

EXAMPLE 6

ESI-TOF MS of sspE 56-mer Plus Calibrant

The mass measurement accuracy that can be obtained using an internal mass standard in the ESI-MS study of PCR products is shown in FIG. 8. The mass standard was a 20-mer phosphorothioate oligonucleotide added to a solution containing a 56-mer PCR product from the B. anthracis spore coat protein sspE. The mass of the expected PCR product distinguishes B. anthracis from other species of Bacillus such as B. thuringiensis and B. cereus.

EXAMPLE 7

B. anthracis ESI-TOF synthetic 16S_1228 Duplex

An ESI-TOF MS spectrum was obtained from an aqueous solution containing 5 μM each of synthetic analogs of the expected forward and reverse PCR products from the nucleotide 1228 region of the B. anthracis 16S rRNA gene. The results (FIG. 9) show that the molecular weights of the forward and reverse strands can be accurately determined and easily distinguish the two strands. The [M-21H⁺]²¹⁻ and [M-20H⁺]²⁰⁻ charge states are shown.
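The relationship between a strand's molecular weight and the m/z values of its negative-ion charge states can be sketched as below. This is an illustration under stated assumptions: the proton mass is a textbook constant, `EXAMPLE_MASS` is a hypothetical round number rather than a value read from FIG. 9, and the function names are not from the patent. For an ion [M-zH⁺]ᶻ⁻, z protons are removed from the neutral molecule of mass M.

```python
PROTON_MASS = 1.007276  # Da; textbook value, not taken from the patent


def mz_negative(neutral_mass, charge):
    """m/z of the [M-zH]^z- ion for neutral mass M (Da) and charge state z."""
    return (neutral_mass - charge * PROTON_MASS) / charge


def neutral_mass_from_mz(mz, charge):
    """Invert mz_negative: recover M from an observed m/z and its charge state."""
    return charge * (mz + PROTON_MASS)


EXAMPLE_MASS = 14000.0  # hypothetical strand mass in Da, for illustration only

for z in (20, 21):
    print(f"z = {z}: m/z = {mz_negative(EXAMPLE_MASS, z):.3f}")
```

Because each charge state gives an independent estimate of M, adjacent charge states such as 20- and 21- can be cross-checked against each other.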

EXAMPLE 8

ESI-FTICR-MS of Synthetic B. anthracis 16S_1337 46 Base Pair Duplex

An ESI-FTICR-MS spectrum was obtained from an aqueous solution containing 5 μM each of synthetic analogs of the expected forward and reverse PCR products from the nucleotide 1337 region of the B. anthracis 16S rRNA gene. The results (FIG. 10) show that the molecular weights of the strands can be distinguished by this method. The [M-16H⁺]¹⁶⁻ through [M-10H⁺]¹⁰⁻ charge states are shown. The insert highlights the resolution that can be realized on the FTICR-MS instrument, which allows the charge state of the ion to be determined from the mass difference between peaks differing by a single ¹³C substitution.
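The charge-state determination mentioned above follows from the fact that isotopic peaks differing by one ¹³C are separated by about 1.003355/z on the m/z axis. A minimal sketch, assuming that textbook mass difference between ¹³C and ¹²C and using hypothetical function names:

```python
C13_C12_DELTA = 1.003355  # Da; mass difference between 13C and 12C (textbook value)


def expected_spacing(charge):
    """m/z spacing between adjacent isotope peaks for an ion of the given charge."""
    return C13_C12_DELTA / charge


def charge_from_isotope_spacing(spacing_mz):
    """Estimate the charge state from the measured spacing of adjacent isotope peaks."""
    if spacing_mz <= 0:
        raise ValueError("spacing must be positive")
    return round(C13_C12_DELTA / spacing_mz)


# Spacings expected for the charge states reported in FIG. 10 (16- through 10-).
for z in range(16, 9, -1):
    print(f"z = {z}: isotope spacing = {expected_spacing(z):.4f} m/z")
```

At these charge states the spacing is well under 0.1 m/z for z above 10, which is why the resolving power of the FTICR instrument matters for this measurement.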

EXAMPLE 9

ESI-TOF MS of 56-mer Oligonucleotide from saspB Gene of B. anthracis with Internal Mass Standard

ESI-TOF MS spectra were obtained on a synthetic 56-mer oligonucleotide (5 μM) from the saspB gene of B. anthracis containing an internal mass standard at an ESI flow rate of 1.7 μL/min as a function of sample consumption. The results (FIG. 11) show that the signal-to-noise ratio improves as more scans are summed, and that the standard and the product are visible after only 100 scans.

EXAMPLE 10

ESI-TOF MS of an Internal Standard with Tributylammonium (TBA)-Trifluoroacetate (TFA) Buffer

An ESI-TOF-MS spectrum of a 20-mer phosphorothioate mass standard was obtained following addition of 5 mM TBA–TFA buffer to the solution. This buffer strips charge from the oligonucleotide and shifts the most abundant charge state from [M-8H⁺]⁸⁻ to [M-3H⁺]³⁻ (FIG. 12).

1. A method of identifying a bacterial bioagent comprising: contacting nucleic acid from the bioagent with at least one pair of primers which hybridize to flanking sequences of the nucleic acid, wherein the flanking sequences flank a variable nucleic acid sequence of the bioagent; amplifying the variable nucleic acid sequence to produce an amplification product; determining the molecular mass of the amplification product by mass spectrometry; and comparing the molecular mass of the amplification product to calculated or measured molecular masses of analogous amplification products of one or more known bacterial bioagents present in a database comprising 19 or more molecular masses, with the proviso that sequencing of the amplification product is not used to identify the bacterial bioagent.
2. A method of identifying a bacterial bioagent comprising: contacting nucleic acid from the bioagent with at least one pair of primers which hybridize to flanking sequences of the nucleic acid, wherein the flanking sequences flank a variable nucleic acid sequence of the bioagent; amplifying the variable nucleic acid sequence to produce an amplification product; determining the base composition of the amplification product by mass spectrometry; and comparing the base composition of the amplification product to calculated or measured base composition of analogous amplification products of one or more known bacterial bioagents present in a database comprising 19 or more base compositions, with the proviso that sequencing of the amplification product is not used to identify the bacterial bioagent.
3. The method of claim 1 or claim 2 wherein the amplifying step comprises the polymerase chain reaction.
4. The method of claim 1 or claim 2 wherein the nucleic acid encodes ribosomal RNA or a protein involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake or secretion.
5. The method of claims 1 or 2 wherein the mass spectrometry is Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) or time of flight mass spectrometry (TOF-MS).
6. The method of claim 1 or claim 2 wherein the variable nucleic acid sequence exhibits no greater than about 5% identity among bacterial bioagents.
7. The method of claim 1 or claim 2 wherein the sequences to which the primers hybridize are separated by between about 60–100 nucleotides.
8. The method of claim 1 or claim 2 wherein the flanking sequences are between about 80 and 100% identical among bacterial bioagents.
9. The method of claim 1 or claim 2 wherein the flanking sequences are greater than about 95% identical among bacterial bioagents.
10. A method of identifying a bacterial bioagent comprising: contacting nucleic acid from the bacterial bioagent with at least one pair of primers which hybridize to flanking sequences of the nucleic acid, wherein each member of the pair of primers hybridizes to one hundred or more bacterial bioagents wherein the flanking sequences flank a variable nucleic acid sequence of the one hundred or more bacterial bioagents; amplifying the variable nucleic acid sequence to produce an amplification product; determining the molecular mass or base composition of the amplification product by mass spectrometry; and comparing the molecular mass to calculated or measured molecular masses or base compositions of analogous amplification products of more than one known bacterial bioagents, thereby identifying the bacterial bioagent.
11. The method of claim 10 wherein the bacterial bioagent is a member of the genus Acinetobacter, Aeromonas, Bacillus, Bacteroides, Bartonella, Bordetella, Borrelia, Brucella, Burkholderia, Campylobacter, Chlamydia, Chlamydophila, Clostridium, Coxiella, Enterococcus, Escherichia, Francisella, Fusobacterium, Haemophilus, Helicobacter, Klebsiella, Legionella, Leptospira, Listeria, Moraxella, Mycobacterium, Mycoplasma, Neisseria, Proteus, Pseudomonas, Rhodobacter, Rickettsia, Salmonella, Shigella, Staphylococcus, Streptobacillus, Streptomyces, Treponema, Ureaplasma, Vibrio, or Yersinia.
12. The method of claim 10 wherein the amplifying step comprises the polymerase chain reaction.
13. The method of claim 10 wherein the nucleic acid encodes ribosomal RNA or a protein involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake or secretion.
14. The method of claim 10 wherein the mass spectrometry is Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) or time of flight mass spectrometry (TOF-MS).
15. The method of claim 10 wherein the molecular masses of the more than one known bacterial bioagents are contained in a database of molecular masses.
16. The method of claim 10 wherein the base compositions of the more than one known bacterial bioagents are contained in a database of base compositions.
17. The method of claim 10 wherein the variable nucleic acid sequence exhibits no greater than about 5% identity among bacterial bioagents.
18. The method of claim 10 wherein the sequences to which the primers hybridize are separated by between about 60–100 nucleotides.
19. The method of claim 10 wherein the flanking sequences are between about 80 and 100% identical among bacterial bioagents.
20. The method of claim 10 wherein the flanking sequences are greater than about 95% identical among bacterial bioagents.
21. The method of claims 1 or 2 wherein the bacterial bioagent is a member of the genus Acinetobacter, Aeromonas, Bacillus, Bacteroides, Bartonella, Bordetella, Borrelia, Brucella, Burkholderia, Campylobacter, Chlamydia, Chlamydophila, Clostridium, Coxiella, Enterococcus, Escherichia, Francisella, Fusobacterium, Haemophilus, Helicobacter, Klebsiella, Legionella, Leptospira, Listeria, Moraxella, Mycobacterium, Mycoplasma, Neisseria, Proteus, Pseudomonas, Rhodobacter, Rickettsia, Salmonella, Shigella, Staphylococcus, Streptobacillus, Streptomyces, Treponema, Ureaplasma, Vibrio, or Yersinia.
22. The method of claims 1, 2 or 10 wherein the bacterial bioagent is identified at the species level.
23. The method of claims 1, 2 or 10 wherein the bacterial bioagent is identified at the strain level.
24. The method of claims 1, 2 or 10 wherein the bacterial bioagent is a biological warfare agent.
25. The method of claim 24 wherein the biological warfare agent is Bacillus anthracis, Yersinia pestis, Francisella tularensis, Brucella suis, Brucella abortus, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Salmonella typhi, Rickettsia typhi, Rickettsia prowazekii, Coxiella burnetii, Rhodobacter capsulatus, Chlamydia pneumoniae, Escherichia coli, Shigella dysenteriae, Shigella flexneri, Bacillus cereus, Clostridium botulinum, Coxiella burnetii, Pseudomonas aeruginosa, Legionella pneumophila, or Vibrio cholerae.
26. The method of claims 1, 2 or 10 wherein the pair of primers comprises at least one nucleotide analog.
27. The method of claim 26 wherein the nucleotide analog is inosine, uridine, 2,6-diaminopurine, propyne C, or propyne T.
28. The method of claims 1, 2 or 10 wherein a molecular mass-modifying tag is incorporated into the amplification product to limit the number of possible base compositions consistent with the mass of the amplification product.
29. The method of claim 13 wherein the protein is DNA polymerase, RNA polymerase, elongation factor Tu, heat shock protein groEL, phosphoglycerate kinase, NADH dehydrogenase, DNA ligase, DNA topoisomerase, or elongation factor G.